Job description
Cummins
Supports, develops and maintains a data and analytics platform. Effectively and efficiently processes, stores and delivers data to analysts and other users. Works with business and IT teams to understand requirements and best leverage technologies to enable agile data delivery at scale.
Implements and automates the use of our distributed system to ingest and transform data from multiple sources (relational, event-based, unstructured). Implements methods to continuously monitor and troubleshoot data quality and integrity issues.
Implements data governance processes and methodologies for managing metadata, access, and retention of data for internal and external users. Develops reliable, efficient, scalable, and high-quality data pipelines with monitoring and alerting mechanisms that combine a variety of sources using ETL/ELT tools or scripting languages.
Develops physical data models and implements data storage architectures according to design guidelines. Analyzes complex data elements and systems, data flow, dependencies, and relationships to contribute to conceptual, logical, and physical data models.
Participates in testing and troubleshooting data pipelines.
Develops and operates large-scale data storage and processing solutions using various distributed and cloud-based platforms for data storage (e.g., data lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB, and others).
Leverages agile development practices such as DevOps, Scrum, Kanban, and continuous improvement cycles for data-driven applications.
Skills
Data Extraction – Performs extract, transform, and load (ETL) activities on data from a variety of sources and prepares it for use by downstream applications and users, using appropriate tools and technologies.
Documentation – Documents information and solutions based on knowledge gained during product development; communicates with stakeholders with the goal of improving productivity and enabling effective knowledge transfer to others who were not involved in the original learning process.
Quality Assurance Metrics – Applies the science of measurement to assess whether a solution achieves its intended outcomes using the IT Operating Model (ITOM), including SDLC standards, tools, metrics, and performance indicators, to deliver a quality product.
Solution Validation Testing – Validates a configuration change or solution using best practices defined by the function, including Systems Development Life Cycle (SDLC) standards, tools, and metrics, to ensure it works as intended and meets customer requirements.
System Requirements Engineering – Uses appropriate methods and tools to translate stakeholder needs into verifiable requirements for which designs are developed; establishes acceptance criteria for the system of interest through analysis, assignment, and negotiation; tracks the status of requirements throughout the system life cycle; evaluates the impact of changes to system requirements on project scope, schedule, and resources; establishes and maintains information links to related artifacts.
Problem Solving – Resolves problems using a systematic analysis process, utilizing industry-standard methods to ensure traceability of the problem and protect the customer; determines assignable root cause; implements robust, data-based solutions; identifies systemic root causes and recommends actions to prevent recurrence of the problem.
Data Quality – Identifies, understands, and corrects data errors to support effective information management in operational business processes and decision making.
Programming – Creates, writes, and tests computer code, test scripts, and build scripts using algorithmic analysis and design, industry standards and tools, version control, and build and test automation to meet business, technical, security, governance, and compliance requirements.
Customer Focus – Building strong customer relationships and delivering customer-oriented solutions.
Decision Quality – Making good and timely decisions that move the business forward.
Collaboration – Building partnerships and working with others to achieve common goals.
Effective Communication – Developing and delivering communications through a variety of channels that convey a clear understanding of the unique needs of different audiences.
Training, Licenses, Certifications
College, university, or equivalent degree in a relevant technical field preferred, or equivalent relevant work experience.
Export control or sanctions compliance license may be required for this position.
Experience
Relevant experience is preferred, such as temporary student employment, internship, co-op, or other extracurricular team activities.
Knowledge of the latest technologies in data science is a plus, including:
- Experience with open-source Big Data technologies: Spark, Scala/Java, MapReduce, Hive, HBase, and Kafka, or equivalent college coursework
- SQL query language
- Experience with cloud-based cluster computing implementations
- Familiarity with developing applications that require large file moves in a cloud-based environment
- Experience with agile software development
- Experience with the development of analytical solutions
- Experience with IoT technology
Key Qualifications:
- Cloud Computing (Azure) – Azure Data Factory, Azure Databricks, Azure Synapse Analytics, Azure Data Lake Storage, Azure SQL Database
- Relational Databases – Oracle, SQL Server, PostgreSQL
- Good to know – Power BI, Snowflake