Impactian EToD™ Developer (2%)
Extraordinary Talent on Demand™
Jakub Kielbasiewicz
Python Developer
Software Developer in Wrocław, Poland
SQL
Spark
R
Python
Flask
Databricks
Azure
Google Cloud Platform
AWS
ETL
MLOps
PowerBI
Tableau
Airflow
Kafka
About
Jakub is a Senior Big Data Engineer and MLOps enthusiast with broad experience in Big Data applications. He has delivered numerous architectures and implementations of data-intensive applications, mainly in cloud environments, with a strong emphasis on data, ML, and AI solutions.
Skills
Languages
R, Python, Scala
Frameworks
Spark, Hadoop, Flask, Django, Kafka
Libraries/APIs
pandas, scikit-learn, Matplotlib, NumPy, PySpark, SQLAlchemy, pytest
Platforms
Azure, AWS, Google Cloud Platform, Databricks
Storage
Cassandra, Apache Hadoop, Relational Database, Data Lake, Delta Lake
Other
Functional Programming
Tools
Tableau, Power BI, SSIS, SSAS, BigQuery, Redshift, Data Factory, Airflow
Experience
Data Engineering
5 years
SQL
5 years
Spark
3 years
Azure
3 years
AWS
2 years
R
2 years
Google Cloud Platform
1 year
Highlight Projects
IoT Data Quality Control
Developed a data quality solution for an IoT implementation at a manufacturing company, enabling assessment of sensor data quality.
Personal Project
  • The initial idea was to deliver a real-time dashboarding solution using Power BI, but the scope grew quickly: the final product involved machine learning algorithms, database redesign, and metadata control activities. Along with the dashboard, I proposed a new approach to data quality management in companies: an iterative methodology and a step-by-step process for working with IoT data quality.
Digital Goods Warehouse and Bookkeeping Web App
Full product to manage digital goods.
digitalwarehou.se
  • I architected an award-winning project in Azure, datawarehou.se, an online application for storing digital goods (such as game keys). Together with a full-stack team, I am working on making this solution production-ready.
Snowflake migration
Data warehouse migration to Snowflake
HumanN
  • Migrated data warehouse to Snowflake.
  • Re-designed reporting environment (Tableau).
  • Tuned ETL performance.


Data Integration
Designed and implemented a reporting solution for an insurance broker agency.
Toptal
  • Streamlined reporting process.
  • Designed and implemented data warehouse.
  • Implemented ETL process.
  • Enabled the company to gain insight from its data.


Work Experience
Senior Big Data Engineer
Circle K | Sep 2020 - Present
  • Maintained Databricks clusters (performance tuning, cluster scaling, updating, monitoring).
  • Streamlined process development.
  • Handled machine learning deployment and scaling.
  • Tuned performance of Spark applications (SparkR, PySpark, Spark SQL).
  • Designed, built, and supported data processing pipelines in PySpark, Spark SQL, SparkR, ADF, and Airflow.
  • Maintained and improved Airflow pipelines.
Spark
Databricks
R
SparkR
Airflow
Azure
Machine Learning
Docker
Kubernetes
Cloud Data Engineer
Roche | Apr 2020 - Sep 2020
  • Developed data pipelines for IoT solutions in Python, Azure Synapse, HDInsight, Azure Data Factory, Databricks, and Cosmos DB.
  • Architected storage solutions and ETL processes (Azure).
  • Tuned Databricks performance.
  • Created, maintained, and streamlined ADF processes.
Azure
Spark
Databricks
SQL
ADF
Synapse
IoT
Python
Data Engineer
Aberdeen | Jun 2019 - Apr 2020
  • Developed a Master Data Management solution: created new ways to clean up the data using web scraping and algorithms (Python, PySpark).
  • Tuned performance.
  • Set up SQL Server administration: data partitioning, index maintenance, security, and backup policies.
  • Improved advanced string-matching algorithms, with about a 500% improvement in edge cases.
  • Successfully migrated the product to AWS, redesigning the data product with AWS EMR, S3, RDS, Lambda, Athena, and Glue.
AWS
Python
Spark
SQL
SSIS
Elasticsearch
BI Developer
Ryanair | Dec 2018 - Jun 2019
  • Created ETL processes that allowed the company to integrate data during the LaudaMotion takeover and use data science algorithms to measure fuel consumption.
  • Modeled new data marts focused on integrating big data solutions based on web analytics and marketing with the consumer-related data warehouse.
  • Developed data quality checks.
  • Prepared PoC solutions in Azure (Databricks, ADF), graph databases, and Power BI.
ETL
SSIS
SSAS
PowerBI
R
Databricks
Azure
ADF
ETL & Automation Analyst
Ryanair | Apr 2018 - Dec 2019
  • Tuned query performance, greatly improving the ETL processes required for reporting solutions in the Marketing team.
  • Automated data integration (R, Python, SQL, MDX), fully eliminating manual steps from the ETL process.
  • Resolved Tableau issues.
ETL
R
Python
MDX
Impala
Hadoop
Bash
SQL
BI Consultant
Tech Data Client Solutions | Jun 2016 - Apr 2018
  • Maintained current workloads.
  • Resolved ETL issues.
  • Created an end-to-end reporting environment and solutions for Tesco Mobile, developing a MicroStrategy solution to enable data governance.
MicroStrategy
T-SQL
SSIS
SSAS
PowerBI
ETL
Education
Wroclaw University of Science and Technology
Wrocław, Poland | Oct 2016 - Feb 2020
Bachelor's Degree in Computer Science
University of Wroclaw
Wrocław, Poland | Oct 2014 - Jun 2017
Bachelor's Degree in Economics
Certifications
Microsoft Certified: Azure AI Fundamentals
Aug 2020 - Permanent
Microsoft
Microsoft Certified: Azure Data Engineer Associate
Jul 2020 - Jul 2022
Microsoft
AWS: Cloud Practitioner
May 2020 - May 2022
AWS
GCP: Associate Cloud Engineer
Dec 2019 - Dec 2021
Google
Microsoft Certified: Azure Administrator Associate
Apr 2019 - Apr 2021
Microsoft
MCSE: Data Management and Analytics
Mar 2019 - Permanent
Microsoft
MCSA: SQL BI Reporting
Sep 2019 - Permanent
Microsoft
MCSA: SQL 2016 BI Development
Jun 2018 - Permanent
Microsoft
MCSA: SQL 2016 Database Administration
Mar 2019 - Permanent
Microsoft
MCSA: SQL 2016 Database Development
Apr 2018 - Permanent
Microsoft