
Job Description
We help the world run better
At SAP, we keep it simple: you bring your best to us, and we'll bring out the best in you. We're builders touching over 20 industries and 80% of global commerce, and we need your unique talents to help shape what's next. The work is challenging – but it matters. You'll find a place where you can be yourself, prioritize your wellbeing, and truly belong. What's in it for you? Constant learning, skill growth, great benefits, and a team that wants you to grow and succeed.
Join us as a Senior Big Data Engineer supporting the SAP Concur platform, working hybrid in São Leopoldo. You will design, build, and evolve the data pipelines and infrastructure that process billions of transactions, receipts, and travel events every day — powering the analytics, machine learning, and operational reporting that millions of businesses depend on.
What You Will Build — and Why It Matters
You will be a hands-on engineer and technical steward of SAP Concur's data platform, owning the full pipeline lifecycle from raw ingestion through curated, analytics-ready data products. Core areas of ownership include:
Scalable batch and streaming data pipelines that ingest, transform, and deliver structured and semi-structured data across the SAP Concur platform — processing petabytes of expense, travel, and invoicing data.
End-to-end ETL/ELT workflows using industry-standard frameworks, ensuring data is accurate, timely, and traceable from source to consumption layer.
Lakehouse and data warehouse architecture — designing and maintaining Bronze/Silver/Gold medallion layers, partition strategies, and table formats (Delta Lake, Apache Iceberg) that balance query performance with storage cost (a minimal pipeline sketch follows this list).
Real-time streaming pipelines for high-velocity event data, enabling fraud detection signals, live spend dashboards, and near-real-time notification triggers for the Concur notification service.
Data quality and observability frameworks — implementing automated data validation, schema drift detection, SLA monitoring, and lineage tracking so that downstream consumers can trust every dataset.
Pipeline infrastructure and DevOps — building and maintaining CI/CD workflows for data code, managing infrastructure-as-code (Terraform/CDK), and ensuring robust monitoring and alerting across all pipeline stages.
Collaborative data modeling with analytics engineers, data scientists, and product managers to ensure that canonical data models support both operational and analytical use cases.
Continuous optimization of existing pipelines — reducing processing latency, lowering compute and storage costs, and improving resilience and fault-tolerance across the platform.
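To make the ownership areas above concrete, here is a minimal sketch of a bronze-to-silver batch step in PySpark with Delta Lake, as referenced in the lakehouse bullet. The storage paths, column names, and validation rules are illustrative assumptions for this posting, not Concur's actual data model.

# Minimal bronze-to-silver batch step; paths, schema, and rules are illustrative,
# and a Spark session configured with the delta-spark package is assumed.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("expense-silver").getOrCreate()

# Bronze: raw, append-only data as landed from source systems
bronze = spark.read.format("delta").load("s3://datalake/bronze/expense_events")

# Silver: validated, typed, deduplicated records partitioned for downstream queries
silver = (
    bronze
    .filter(F.col("amount").isNotNull() & (F.col("amount") > 0))
    .withColumn("event_date", F.to_date("event_ts"))
    .dropDuplicates(["event_id"])
)

(silver.write.format("delta")
    .mode("overwrite")
    .partitionBy("event_date")
    .save("s3://datalake/silver/expense_events"))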
What You Bring
Languages & Query Fundamentals
Python as the primary language for data pipeline development — fluent in idiomatic Python, PySpark, and scripting for automation and orchestration.
Advanced SQL for complex transformations, window functions, query optimization, and data modeling across both relational and analytical warehouse environments (see the window-function sketch after this list).
Working knowledge of Scala or Java for interacting with Apache Spark internals, JVM-based big data frameworks, or compiled pipeline components.
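As one illustration of the SQL fluency described above, the query below uses a ranking window function to keep each employee's most recent expense report. The table and column names are hypothetical, chosen only for the example.

# Hypothetical table and column names; assumes expense_reports is registered
# as a temp view or catalog table in the Spark session.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("window-example").getOrCreate()

latest_reports = spark.sql("""
    SELECT employee_id,
           report_id,
           submitted_at,
           ROW_NUMBER() OVER (
               PARTITION BY employee_id
               ORDER BY submitted_at DESC
           ) AS rn
    FROM expense_reports
""").filter("rn = 1")  # keep only each employee's most recent report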
Big Data Processing & Frameworks
Expertise in Apache Spark — PySpark, Spark SQL, Structured Streaming, DataFrames, adaptive query execution, and job-level performance tuning (partitioning, caching, broadcast joins, shuffle optimization).
Experience with distributed lakehouse platforms such as Databricks.
Familiarity with the broader Hadoop ecosystem (HDFS, Hive, YARN) as it applies to legacy migration and hybrid on-premises/cloud architectures.
Experience with real-time stream processing using Apache Kafka (producers, consumers, Kafka Streams) and complementary engines such as Apache Flink or Spark Structured Streaming for exactly-once and low-latency processing.
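A minimal Structured Streaming sketch in the spirit of the streaming bullet above: it reads a Kafka topic and writes to a Delta table with a checkpoint, so restarts resume from stored offsets instead of reprocessing. The broker address, topic, and paths are placeholders.

# Placeholders throughout; assumes the spark-sql-kafka and delta-spark
# packages are available to the Spark session.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("spend-events-stream").getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "spend-events")
    .option("startingOffsets", "latest")
    .load()
    .select(F.col("key").cast("string"), F.col("value").cast("string"), "timestamp")
)

# The checkpoint tracks Kafka offsets and sink state, which combined with the
# idempotent Delta sink aims at exactly-once delivery of each event.
query = (
    events.writeStream.format("delta")
    .option("checkpointLocation", "s3://datalake/_checkpoints/spend-events")
    .outputMode("append")
    .start("s3://datalake/silver/spend_events")
)
query.awaitTermination()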
Data Warehousing & Storage
Understanding of cloud data warehouses — Snowflake, Google BigQuery, or Amazon Redshift.
Solid understanding of open table formats: Delta Lake and Apache Iceberg, including ACID transactions, time travel, schema evolution, and compaction strategies (illustrated in the sketch after this list).
Familiarity with data lake storage on AWS S3, Azure Data Lake Storage (ADLS), or Google Cloud Storage, and the trade-offs between lake, warehouse, and lakehouse architectures.
Experience with NoSQL and document stores (DynamoDB) where applicable to high-throughput, low-latency operational data access patterns.
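Two of the table-format features named above, time travel and schema evolution, look roughly like this in Delta Lake; the paths and the new-column scenario are assumptions for illustration.

# Sketch only; paths are placeholders and a delta-enabled Spark session is assumed.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-features").getOrCreate()

# Time travel: read an earlier version of the table, e.g. for audits or backfill diffs
v0 = (spark.read.format("delta")
    .option("versionAsOf", 0)
    .load("s3://datalake/silver/expense_events"))

# Schema evolution: merge a batch that carries a new column into the existing schema
new_batch = spark.read.parquet("s3://landing/expense_events_with_currency")
(new_batch.write.format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .save("s3://datalake/silver/expense_events"))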
Orchestration & DataOps
Understanding of Apache Airflow for authoring, scheduling, and monitoring DAG-based pipeline workflows (a minimal DAG sketch follows this list); familiarity with alternatives such as Prefect or Dagster is a plus.
Experience implementing dbt (data build tool) for in-warehouse SQL transformations, testing, documentation, and lineage — including dbt Cloud or dbt Core with version-controlled model management.
Strong DevOps and DataOps practices: CI/CD pipeline design for data code using GitHub Actions or similar tools; infrastructure-as-code with Terraform or AWS CDK; containerized pipeline execution with Docker and Kubernetes.
Understanding of data governance concepts — data lineage, metadata management (Apache Atlas, OpenLineage), data cataloging, and data contracts — and practical experience applying them to production pipelines.
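To give a flavor of the orchestration work, here is a minimal Airflow DAG sketch, as referenced in the Airflow bullet above. The task bodies, IDs, and daily schedule are illustrative assumptions.

# Minimal daily ingest -> transform DAG; the `schedule` argument assumes Airflow 2.4+.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def ingest():
    ...  # hypothetical: pull raw data into the bronze layer

def transform():
    ...  # hypothetical: promote bronze data to the silver layer

with DAG(
    dag_id="expense_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    ingest_task >> transform_task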
Cloud Platforms & Infrastructure
Working knowledge of at least one major cloud provider: AWS (S3, Glue, EMR, Kinesis, Lambda, Redshift, RDS), GCP (BigQuery, Dataproc, Pub/Sub, Cloud Composer), or Microsoft Azure (ADLS, Synapse Analytics, Data Factory, Event Hubs).
Comfort deploying and operating workloads in containerized environments — Docker, Kubernetes (EKS/GKE/AKS) — and working with serverless compute for lightweight pipeline tasks.
Experience with cost-aware cloud architecture: query tagging, compute auto-scaling, storage tiering, and right-sizing clusters to balance performance against infrastructure spend.
Familiarity with observability and monitoring tooling relevant to data platforms — Grafana, CloudWatch, Datadog, or Monte Carlo — for pipeline health, data freshness SLAs, and anomaly detection.
Data Quality & Reliability
Experience implementing automated data quality testing frameworks such as Great Expectations or Soda, including row-level validation, schema checks, freshness assertions, and drift alerting.
Understanding of idempotency, exactly-once semantics, and late-arriving data patterns — designing pipelines that can be safely re-run without duplicating or corrupting data.
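A common pattern for the safe re-runs described above, sketched here with Delta Lake's MERGE and hypothetical paths and keys, is to upsert on a business key so that replaying the same batch is a no-op.

# Idempotent upsert: re-running the same batch cannot create duplicates because
# MERGE matches on the business key. Paths and key names are placeholders,
# and the delta-spark package is assumed.
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.appName("idempotent-upsert").getOrCreate()

updates = spark.read.parquet("s3://landing/expense_events/2024-06-01")
target = DeltaTable.forPath(spark, "s3://datalake/silver/expense_events")

(target.alias("t")
    .merge(updates.alias("s"), "t.event_id = s.event_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())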
Collaboration & Leadership
Fluent English for collaborating with global, multi-regional teams across the Americas, EMEA, and APJ.
Ability to partner with data scientists, analytics engineers, product managers, and software engineers — translating business requirements into sound technical data models and pipeline designs.
Proactive communication style — comfortable raising data quality issues, SLA risks, and infrastructure concerns to stakeholders before they become production incidents.
Experience using AI coding assistants (Claude Code, Cursor, or similar) and AI-assisted data quality tooling to accelerate pipeline development and debugging is a plus.
Domain & Platform Knowledge
Familiarity with financial transaction data, expense management, ERP integrations, or travel and hospitality data domains is advantageous.
Experience working within SAP BTP, SAP HANA, or SAP Datasphere data ecosystems is a plus.
Where You Belong
A diverse, inclusive culture where global perspectives shape better products — SAP's workforce spans more than 160 countries.
A hybrid work environment in São Leopoldo that blends flexibility with meaningful in-person collaboration.
Cross-cultural, cross-functional teams that support shared learning and collective problem-solving.
Continuous learning through SAP Learning Hub, external conference support, and access to leading-edge data engineering tooling.
A team culture that values clean, observable, and well-tested data systems — and psychological safety to raise ideas, challenge assumptions, and propose improvements.
Bring out your best
SAP innovations help more than 400,000 customers worldwide work together more efficiently and use business insight more effectively. Originally known for leadership in enterprise resource planning (ERP) software, SAP has evolved to become a market leader in end-to-end business application software and related services for database, analytics, intelligent technologies, and experience management. As a cloud company with 200 million users and more than 100,000 employees worldwide, we are purpose-driven and future-focused, with a highly collaborative team ethic and commitment to personal development. Whether connecting global industries, people, or platforms, we help ensure every challenge gets the solution it deserves. At SAP, you can bring out your best.
We win with inclusion
SAP’s culture of inclusion, focus on health and well-being, and flexible working models help ensure that everyone – regardless of background – feels included and can run at their best. At SAP, we believe we are made stronger by the unique capabilities and qualities that each person brings to our company, and we invest in our employees to inspire confidence and help everyone realize their full potential. We ultimately believe in unleashing all talent and creating a better world.
SAP is committed to the values of Equal Employment Opportunity and provides accessibility accommodations to applicants with physical and/or mental disabilities. If you are interested in applying for employment with SAP and are in need of accommodation or special assistance to navigate our website or to complete your application, please send an e-mail with your request to Recruiting Operations Team: Careers@sap.com.
For SAP employees: Only permanent roles are eligible for the SAP Employee Referral Program, according to the eligibility rules set in the SAP Referral Policy. Specific conditions may apply for roles in Vocational Training.
Qualified applicants will receive consideration for employment without regard to their age, race, religion, national origin, ethnicity, gender (including pregnancy, childbirth, et al.), sexual orientation, gender identity or expression, protected veteran status, or disability, in compliance with applicable federal, state, and local legal requirements.
Successful candidates might be required to undergo a background verification with an external vendor.
AI Usage in the Recruitment Process
For information on the responsible use of AI in our recruitment process, please refer to our Guidelines for Ethical Usage of AI in the Recruiting Process.
Please note that any violation of these guidelines may result in disqualification from the hiring process.
Requisition ID: 450646 | Work Area: Software-Design and Development | Expected Travel: 0 - 10% | Career Status: Professional | Employment Type: Regular Full Time | Additional Locations: #LI-Hybrid