Data Scientist (Machine Learning, PySpark, Databricks, Long-Range Forecasting Focus)
Company: Hakkoda
Location: Plano
Posted on: June 2, 2025
Job Description:
About Hakkoda
Hakkoda, an IBM Company, is a modern data
consultancy that empowers data-driven organizations to realize the
full value of the Snowflake Data Cloud. We provide consulting and
managed services in data architecture, data engineering, analytics
and data science. We are renowned for bringing our clients deep
expertise, being easy to work with, and being an amazing place to
work! We are looking for curious and creative individuals who want
to be part of a fast-paced, dynamic environment, where everyone's
input and efforts are valued. We hire outstanding individuals and
give them the opportunity to thrive in a collaborative atmosphere
that values learning, growth, and hard work. Our team is
distributed across North America, Latin America, India and Europe.
If you have the desire to be a part of an exciting, challenging,
and rapidly growing Snowflake consulting services company, and if
you are passionate about making a difference in this world, we
would love to talk to you!

We are seeking a skilled Data Scientist
with 2 to 5 years of experience, specializing in Machine Learning,
PySpark, and Databricks, with a proven track record in long-range
demand and sales forecasting. This role is crucial for the
development and implementation of an automotive OEM's
next-generation Intelligent Forecast Application. The position will
involve building, optimizing, and deploying large-scale machine
learning models for complex, long-term forecasting challenges using
distributed computing frameworks, specifically PySpark on the
Databricks platform. The work will directly support strategic
decision-making across the automotive value chain, including areas
like long-term demand planning, production scheduling, and
inventory optimization.

The ideal candidate will have hands-on
experience developing and deploying ML models for forecasting,
particularly long-range predictions, in a production environment
using PySpark and Databricks. This role requires strong technical
skills in machine learning, big data processing, and time series
forecasting, combined with the ability to work effectively within a
technical team to deliver robust and scalable long-range
forecasting solutions.

Key Responsibilities:
- Design, develop, and implement scalable and accurate machine
learning models specifically for long-range demand and sales
forecasting challenges.
- Apply advanced time series analysis techniques and integrate
them with machine learning models leveraging PySpark for data
processing and model training on large datasets within the
Databricks environment.
- Implement probabilistic forecasting methods using PySpark to
capture uncertainty in long-range predictions.
- Develop robust solutions for hierarchical and grouped
long-range forecasting on distributed data.
- Build and optimize large-scale data pipelines for ingesting,
cleaning, transforming, and engineering features relevant to
long-range forecasting from diverse, complex automotive datasets
using PySpark on Databricks.
- Develop and implement robust code for model training,
inference, and deployment of long-range forecasting models directly
within the Databricks platform.
- Apply MLOps principles compatible with Databricks workflows for
model versioning, monitoring, retraining, and managing the
lifecycle of long-range ML forecasting models in production.
- Collaborate with Data Engineering and IT Operations to ensure
seamless deployment and operational efficiency of the forecasting
application on Databricks.
- Evaluate long-range forecasting model performance using
relevant metrics (e.g., MAE, RMSE, MAPE, considering metrics
suitable for longer horizons) and optimize models and data
processing pipelines for improved accuracy and efficiency within
the PySpark/Databricks ecosystem.
- Work effectively as part of a technical team, collaborating
with other data scientists, data engineers, and software developers
to integrate ML long-range forecasting solutions into the broader
forecasting application built on Databricks.
- Communicate technical details and forecasting results
effectively within the technical team.

Required Qualifications:
- Bachelor's or Master's degree in Data Science, Computer
Science, Statistics, Applied Mathematics, or a closely related
quantitative field.
- 2 to 5 years of hands-on experience in a Data Scientist or
Machine Learning Engineer role.
- Proven experience developing and deploying machine learning
models in a production environment.
- Demonstrated experience in long-range demand and sales
forecasting.
- Significant hands-on experience with PySpark for large-scale
data processing and machine learning.
- Extensive practical experience working with the Databricks
platform, including notebooks, jobs, and ML capabilities.

Technical Skills:
- Expert proficiency in PySpark.
- Expert proficiency in the Databricks platform.
- Strong proficiency in Python and SQL.
- Experience with machine learning libraries compatible with
PySpark (e.g., MLlib, or integrating other libraries).
- Experience with advanced time series forecasting techniques and
their implementation.
- Experience with distributed computing concepts and optimization
techniques relevant to PySpark.
- Hands-on experience with a major cloud provider (Azure, AWS, or
GCP) in the context of using Databricks.
- Familiarity with MLOps concepts and tools used in a Databricks
environment.
- Experience with data visualization tools.
- Analytical skills with a deep understanding of machine learning
algorithms and their application to forecasting.
- Ability to troubleshoot and solve complex technical problems
related to big data and machine learning workflows.

Preferred Qualifications:
- Experience with specific long-range forecasting methodologies
and libraries used in a distributed environment.
- Experience with real-time or streaming data processing using
PySpark for near-term forecasting components that might complement
long-range models.
- Familiarity with automotive data types relevant to long-range
forecasting (e.g., economic indicators affecting car sales,
long-term market trends).
- Experience with distributed version control systems (e.g.,
Git).
- Knowledge of agile development methodologies.

Soft Skills:
- Ability to work effectively as part of a technical team.
- Clear and concise communication of technical details and
forecasting results.
- Ability to tackle complex technical challenges and find
efficient solutions.
- Eagerness to learn and adapt to new technologies and
methodologies within the PySpark/Databricks ecosystem and
advancements in long-range forecasting.
- Ability to understand business needs related to long-term
planning.

Benefits:
- Medical, Dental, Vision.
- Life Insurance.
- Paid parental leave.
- Flexible PTO Options.
- Company Bonus Program.
- Work from home benefits.
- Technical training and certifications.
- Robust learning and development opportunities.
- Trip to Costa Rica.

Hakkoda is committed to fostering diversity, equity, and
inclusion within our teams. A diverse workforce enhances our
ability to serve clients and enriches our culture. We encourage
candidates of all races, genders, sexual orientations, abilities,
and experiences to apply, creating a workplace where everyone can
succeed and thrive.

Ready to take your career to the next level?
Apply today and join a team that's shaping the future!

Hakkoda is an IBM subsidiary that has been acquired by IBM and will be
integrated into the IBM organization. Hakkoda will be the hiring
entity. By proceeding with this application, you understand that
Hakkoda will share your personal information with other IBM
subsidiaries involved in your recruitment process, wherever these
are located. More information on how IBM protects your personal
information, including the safeguards in case of cross-border data
transfer, is available