Roles & Responsibilities:
- Clean, prepare and optimize data at scale for ingestion and consumption by machine learning models
- Drive the implementation of new data management projects and the restructuring of the current data architecture
- Implement complex automated workflows and routines using workflow scheduling tools
- Build continuous integration, test-driven development and production deployment frameworks
- Drive collaborative reviews of design, code, test plans and dataset implementation performed by other data engineers in support of maintaining data engineering standards
- Anticipate, identify and solve issues concerning data management to improve data quality
- Design and build reusable components, frameworks and libraries at scale to support machine learning products
- Design and implement product features in collaboration with business and technology stakeholders
- Analyze and profile data for the purpose of designing scalable solutions
- Troubleshoot complex data issues and perform root cause analysis to proactively resolve product and operational issues
- Mentor and develop other data engineers in adopting best practices
- Influence and communicate effectively, both verbally and in writing, with team members and business stakeholders
Experience & Skills Fitment:
- 4+ years of experience developing scalable PySpark applications or solutions on distributed platforms
- Experience with Google Cloud Platform (GCP); experience with other cloud platform tools is a plus
- Experience working with data warehousing tools, including DynamoDB, SQL, and Snowflake
- Experience architecting data products on streaming, serverless, and microservices architectures and platforms
- Experience with PySpark, Spark (Scala/Python/Java) and Kafka
- Work experience using Databricks (Data Engineering and Delta Lake components)
- Experience working with Big Data platforms, including Dataproc and Databricks
- Experience working with distributed technology tools, including Spark, Presto, Databricks, and Airflow
- Working knowledge of data warehousing and data modeling
- Experience working in Agile and Scrum development processes
- Bachelor's degree in Computer Science, Information Systems, Business, or other relevant subject area
Benefits:
- Kloud9 provides a robust compensation package and a forward-looking opportunity for growth in emerging fields.
Equal Opportunity Employer:
- Kloud9 is an equal opportunity employer and will not discriminate against any employee or applicant on the basis of age, color, disability, gender, national origin, race, religion, sexual orientation, veteran status, or any classification protected by federal, state, or local law.