
Job Description
We help the world run better
At SAP, we keep it simple: you bring your best to us, and we'll bring out the best in you. We're builders touching over 20 industries and 80% of global commerce, and we need your unique talents to help shape what's next. The work is challenging – but it matters. You'll find a place where you can be yourself, prioritize your wellbeing, and truly belong. What's in it for you? Constant learning, skill growth, great benefits, and a team that wants you to grow and succeed.
Join us as a Senior Big Data Engineer supporting the SAP Concur platform, working hybrid in São Leopoldo. You will design, build, and evolve the data pipelines and infrastructure that process billions of transactions, receipts, and travel events every day — powering the analytics, machine learning, and operational reporting that millions of businesses depend on.
What You Will Build — and Why It Matters
You will be a hands-on engineer and technical steward of SAP Concur's data platform, owning the full pipeline lifecycle from raw ingestion through curated, analytics-ready data products. Core areas of ownership include:
Scalable batch and streaming data pipelines that ingest, transform, and deliver structured and semi-structured data across the SAP Concur platform — processing petabytes of expense, travel, and invoicing data.
End-to-end ETL/ELT workflows using industry-standard frameworks, ensuring data is accurate, timely, and traceable from source to consumption layer.
Lakehouse and data warehouse architecture — designing and maintaining Bronze/Silver/Gold medallion layers, partition strategies, and table formats (Delta Lake, Apache Iceberg) that balance query performance with storage cost (a minimal pipeline sketch follows this list).
Real-time streaming pipelines for high-velocity event data, enabling fraud detection signals, live spend dashboards, and near-real-time notification triggers for the Concur notification service.
Data quality and observability frameworks — implementing automated data validation, schema drift detection, SLA monitoring, and lineage tracking so that downstream consumers can trust every dataset.
Pipeline infrastructure and DevOps — building and maintaining CI/CD workflows for data code, managing infrastructure-as-code (Terraform/CDK), and ensuring robust monitoring and alerting across all pipeline stages.
Collaborative data modeling with analytics engineers, data scientists, and product managers to ensure that canonical data models support both operational and analytical use cases.
Continuous optimization of existing pipelines — reducing processing latency, lowering compute and storage costs, and improving resilience and fault-tolerance across the platform.
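To make the ownership areas above concrete, here is a minimal sketch of a bronze-to-silver batch step in PySpark with Delta Lake, as referenced in the lakehouse bullet. The storage paths, column names, and validation rules are illustrative assumptions for this posting, not Concur's actual data model.

# Minimal bronze-to-silver batch step; paths, schema, and rules are illustrative,
# and a Spark session configured with the delta-spark package is assumed.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("expense-silver").getOrCreate()

# Bronze: raw, append-only data as landed from source systems
bronze = spark.read.format("delta").load("s3://datalake/bronze/expense_events")

# Silver: validated, typed, deduplicated records partitioned for downstream queries
silver = (
    bronze
    .filter(F.col("amount").isNotNull() & (F.col("amount") > 0))
    .withColumn("event_date", F.to_date("event_ts"))
    .dropDuplicates(["event_id"])
)

(silver.write.format("delta")
    .mode("overwrite")
    .partitionBy("event_date")
    .save("s3://datalake/silver/expense_events"))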
What You Bring
Languages & Query Fundamentals
Python as the primary language for data pipeline development — fluent in idiomatic Python, PySpark, and scripting for automation and orchestration.
Advanced SQL for complex transformations, window functions, query optimization, and data modeling across both relational and analytical warehouse environments (see the window-function sketch after this list).
Working knowledge of Scala or Java for interacting with Apache Spark internals, JVM-based big data frameworks, or compiled pipeline components.
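As one illustration of the SQL fluency described above, the query below uses a ranking window function to keep each employee's most recent expense report. The table and column names are hypothetical, chosen only for the example.

# Hypothetical table and column names; assumes expense_reports is registered
# as a temp view or catalog table in the Spark session.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("window-example").getOrCreate()

latest_reports = spark.sql("""
    SELECT employee_id,
           report_id,
           submitted_at,
           ROW_NUMBER() OVER (
               PARTITION BY employee_id
               ORDER BY submitted_at DESC
           ) AS rn
    FROM expense_reports
""").filter("rn = 1")  # keep only each employee's most recent report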
Big Data Processing & Frameworks
Expertise in Apache Spark — PySpark, Spark SQL, Structured Streaming, DataFrames, adaptive query execution, and job-level performance tuning (partitioning, caching, broadcast joins, shuffle optimization).
Experience with distributed lakehouse platforms such as Databricks.
Familiarity with the broader Hadoop ecosystem (HDFS, Hive, YARN) as it applies to legacy migration and hybrid on-premises/cloud architectures.
Experience with real-time stream processing using Apache Kafka (producers, consumers, Kafka Streams) and complementary engines such as Apache Flink or Spark Structured Streaming for exactly-once and low-latency processing.
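A minimal Structured Streaming sketch in the spirit of the streaming bullet above: it reads a Kafka topic and writes to a Delta table with a checkpoint, so restarts resume from stored offsets instead of reprocessing. The broker address, topic, and paths are placeholders.

# Placeholders throughout; assumes the spark-sql-kafka and delta-spark
# packages are available to the Spark session.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("spend-events-stream").getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "spend-events")
    .option("startingOffsets", "latest")
    .load()
    .select(F.col("key").cast("string"), F.col("value").cast("string"), "timestamp")
)

# The checkpoint tracks Kafka offsets and sink state, which combined with the
# idempotent Delta sink aims at exactly-once delivery of each event.
query = (
    events.writeStream.format("delta")
    .option("checkpointLocation", "s3://datalake/_checkpoints/spend-events")
    .outputMode("append")
    .start("s3://datalake/silver/spend_events")
)
query.awaitTermination()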
Data Warehousing & Storage
Understanding of cloud data warehouses — Snowflake, Google BigQuery, or Amazon Redshift.
Solid understanding of open table formats: Delta Lake and Apache Iceberg, including ACID transactions, time travel, schema evolution, and compaction strategies (illustrated in the sketch after this list).
Familiarity with data lake storage on AWS S3, Azure Data Lake Storage (ADLS), or Google Cloud Storage, and the trade-offs between lake, warehouse, and lakehouse architectures.
Experience with NoSQL and document stores (DynamoDB) where applicable to high-throughput, low-latency operational data access patterns.
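Two of the table-format features named above, time travel and schema evolution, look roughly like this in Delta Lake; the paths and the new-column scenario are assumptions for illustration.

# Sketch only; paths are placeholders and a delta-enabled Spark session is assumed.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-features").getOrCreate()

# Time travel: read an earlier version of the table, e.g. for audits or backfill diffs
v0 = (spark.read.format("delta")
    .option("versionAsOf", 0)
    .load("s3://datalake/silver/expense_events"))

# Schema evolution: merge a batch that carries a new column into the existing schema
new_batch = spark.read.parquet("s3://landing/expense_events_with_currency")
(new_batch.write.format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .save("s3://datalake/silver/expense_events"))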
Orchestration & DataOps
Understanding of Apache Airflow for authoring, scheduling, and monitoring DAG-based pipeline workflows (a minimal DAG sketch follows this list); familiarity with alternatives such as Prefect or Dagster is a plus.
Experience implementing dbt (data build tool) for in-warehouse SQL transformations, testing, documentation, and lineage — including dbt Cloud or dbt Core with version-controlled model management.
Strong DevOps and DataOps practices: CI/CD pipeline design for data code using GitHub Actions or similar tools; infrastructure-as-code with Terraform or AWS CDK; containerized pipeline execution with Docker and Kubernetes.
Understanding of data governance concepts — data lineage, metadata management (Apache Atlas, OpenLineage), data cataloging, and data contracts — and practical experience applying them to production pipelines.
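To give a flavor of the orchestration work, here is a minimal Airflow DAG sketch, as referenced in the Airflow bullet above. The task bodies, IDs, and daily schedule are illustrative assumptions.

# Minimal daily ingest -> transform DAG; the `schedule` argument assumes Airflow 2.4+.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def ingest():
    ...  # hypothetical: pull raw data into the bronze layer

def transform():
    ...  # hypothetical: promote bronze data to the silver layer

with DAG(
    dag_id="expense_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    ingest_task >> transform_task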
Cloud Platforms & Infrastructure
Working knowledge of at least one major cloud provider: AWS (S3, Glue, EMR, Kinesis, Lambda, Redshift, RDS), GCP (BigQuery, Dataproc, Pub/Sub, Cloud Composer), or Microsoft Azure (ADLS, Synapse Analytics, Data Factory, Event Hubs).
Comfort deploying and operating workloads in containerized environments — Docker, Kubernetes (EKS/GKE/AKS) — and working with serverless compute for lightweight pipeline tasks.
Experience with cost-aware cloud architecture: query tagging, compute auto-scaling, storage tiering, and right-sizing clusters to balance performance against infrastructure spend.
Familiarity with observability and monitoring tooling relevant to data platforms — Grafana, CloudWatch, Datadog, or Monte Carlo — for pipeline health, data freshness SLAs, and anomaly detection.
Data Quality & Reliability
Experience implementing automated data quality testing frameworks such as Great Expectations or Soda, including row-level validation, schema checks, freshness assertions, and drift alerting.
Understanding of idempotency, exactly-once semantics, and late-arriving data patterns — designing pipelines that can be safely re-run without duplicating or corrupting data.
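A common pattern for the safe re-runs described above, sketched here with Delta Lake's MERGE and hypothetical paths and keys, is to upsert on a business key so that replaying the same batch is a no-op.

# Idempotent upsert: re-running the same batch cannot create duplicates because
# MERGE matches on the business key. Paths and key names are placeholders,
# and the delta-spark package is assumed.
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.appName("idempotent-upsert").getOrCreate()

updates = spark.read.parquet("s3://landing/expense_events/2024-06-01")
target = DeltaTable.forPath(spark, "s3://datalake/silver/expense_events")

(target.alias("t")
    .merge(updates.alias("s"), "t.event_id = s.event_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())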
Collaboration & Leadership
Fluent English for collaborating with global, multi-regional teams across the Americas, EMEA, and APJ.
Ability to partner with data scientists, analytics engineers, product managers, and software engineers — translating business requirements into sound technical data models and pipeline designs.
Proactive communication style — comfortable raising data quality issues, SLA risks, and infrastructure concerns to stakeholders before they become production incidents.
Experience using AI coding assistants (Claude Code, Cursor, or similar) and AI-assisted data quality tooling to accelerate pipeline development and debugging is a plus.
Domain & Platform Knowledge
Familiarity with financial transaction data, expense management, ERP integrations, or travel and hospitality data domains is advantageous.
Experience working within SAP BTP, SAP HANA, or SAP Datasphere data ecosystems is a plus.
Where You Belong
A diverse, inclusive culture where global perspectives shape better products — SAP's workforce spans more than 160 countries.
A hybrid work environment in São Leopoldo that blends flexibility with meaningful in-person collaboration.
Cross-cultural, cross-functional teams that support shared learning and collective problem-solving.
Continuous learning through SAP Learning Hub, external conference support, and access to leading-edge data engineering tooling.
A team culture that values clean, observable, and well-tested data systems — and psychological safety to raise ideas, challenge assumptions, and propose improvements.
Bring out your best
SAP innovations help more than 400,000 customers worldwide work together more efficiently and use business insight more effectively. Originally known for leadership in enterprise resource planning (ERP) software, SAP has evolved to become a market leader in end-to-end business application software and related services for database, analytics, intelligent technologies, and experience management. As a cloud company with 200 million users and more than 100,000 employees worldwide, we are purpose-driven and future-focused, with a highly collaborative team ethic and commitment to personal development. Whether connecting global industries, people, or platforms, we help ensure every challenge gets the solution it deserves. At SAP, you can bring out your best.
We win with inclusion
SAP’s culture of inclusion, focus on health and well-being, and flexible working models help ensure that everyone – regardless of background – feels included and can run at their best. At SAP, we believe we are made stronger by the unique capabilities and qualities that each person brings to our company, and we invest in our employees to inspire confidence and help everyone realize their full potential. We ultimately believe in unleashing all talent and creating a better world.
SAP is committed to the values of Equal Employment Opportunity and provides accessibility accommodations to applicants with physical and/or mental disabilities. If you are interested in applying for employment with SAP and are in need of accommodation or special assistance to navigate our website or to complete your application, please send an e-mail with your request to Recruiting Operations Team: Careers@sap.com.
For SAP employees: Only permanent roles are eligible for the SAP Employee Referral Program, according to the eligibility rules set in the SAP Referral Policy. Specific conditions may apply for roles in Vocational Training.
Qualified applicants will receive consideration for employment without regard to their age, race, religion, national origin, ethnicity, gender (including pregnancy, childbirth, et al.), sexual orientation, gender identity or expression, protected veteran status, or disability, in compliance with applicable federal, state, and local legal requirements.
Successful candidates might be required to undergo a background verification with an external vendor.
AI Usage in the Recruitment Process
For information on the responsible use of AI in our recruitment process, please refer to our Guidelines for Ethical Usage of AI in the Recruiting Process.
Please note that any violation of these guidelines may result in disqualification from the hiring process.
Requisition ID: 450646 | Work Area: Software-Design and Development | Expected Travel: 0 - 10% | Career Status: Professional | Employment Type: Regular Full Time | Additional Locations: #LI-Hybrid