Roles & Responsibilities:
- Collaborate with cross-functional teams to understand data requirements and design optimal solutions on the Google Cloud Platform. Design, develop, and maintain ETL pipelines, data integration processes, and data transformation workflows using GCP data services.
- Write efficient, reliable, and maintainable code in Java to implement data processing logic and custom data transformations. Utilize GCP services such as BigQuery, Dataflow, Pub/Sub, and DataProc to build scalable, high-performance data processing solutions (a minimal pipeline sketch follows this list).
- Implement data quality checks, data validation, and monitoring mechanisms to ensure the accuracy and integrity of the data. Optimize and fine-tune data pipelines for performance and cost efficiency, making use of GCP best practices.
- Analyze complex data, organize raw data, and integrate massive datasets from multiple sources to build subject areas and reusable data products.
- Work on implementation teams from concept to operations, providing deep technical subject-matter expertise for successful deployment. Automate all parts of the pipeline to minimize manual effort in development and production. Apply proficiency in machine learning model architecture, data pipeline interaction, and metrics interpretation.
- Evaluate client business challenges and work with the team to identify the best technology solutions. Collaborate with clients to translate business requirements into technical requirements and deliverables. Support existing GCP Data Management implementations.
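The Dataflow responsibilities above center on Apache Beam pipelines. Below is a minimal, illustrative sketch of the kind of streaming pipeline described: read events from Pub/Sub, apply a basic data quality filter, and write to BigQuery. It is written in Python (the role also calls for Java; Beam offers both SDKs), and the project, subscription, table, schema, and field names are hypothetical placeholders, not references to any real environment.

```python
# Illustrative sketch only: a minimal Apache Beam (Dataflow) streaming pipeline.
# All resource names (project, subscription, table) are hypothetical.
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def parse_event(message: bytes) -> dict:
    """Decode a Pub/Sub message payload into a row dict for BigQuery."""
    event = json.loads(message.decode("utf-8"))
    return {"user_id": event["user_id"], "amount": float(event["amount"])}

options = PipelineOptions(streaming=True)  # pass --runner=DataflowRunner to run on GCP
with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadEvents" >> beam.io.ReadFromPubSub(
            subscription="projects/my-project/subscriptions/events-sub")
        | "ParseJson" >> beam.Map(parse_event)
        | "DropInvalid" >> beam.Filter(lambda row: row["amount"] >= 0)  # basic data quality check
        | "WriteToBQ" >> beam.io.WriteToBigQuery(
            "my-project:analytics.events",
            schema="user_id:STRING,amount:FLOAT",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)
    )
```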
Experience & Skills Fitment:
- Strong coding skills in Python (including PySpark) or Scala.
- Hands-on experience with GCP data services such as BigQuery, Dataflow, Pub/Sub, DataProc, and Cloud Storage.
- Experience using Databricks (Data Engineering and Delta Lake components); see the Delta Lake sketch after this list.
- Experience with source control tools such as GitHub and the related development workflow.
- Experience with workflow scheduling tools such as Airflow (a minimal DAG sketch follows this list).
- Strong understanding of data structures and algorithms.
- Experience building data lake solutions leveraging Google data products (e.g., DataProc, AI Building Blocks, Looker, Cloud Data Fusion, Dataprep) and Hive.
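As a companion to the Databricks/Delta Lake item above, here is a minimal PySpark sketch that writes a cleansed DataFrame as a Delta Lake table. The paths and column names are hypothetical; on Databricks the Delta configuration shown below is preconfigured, and it appears here only to make a local run self-contained (it requires the delta-spark package).

```python
# Illustrative sketch only: cleanse a raw dataset and persist it as a Delta table.
# Paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("delta-example")
    # Preconfigured on Databricks; shown here for a self-contained local run.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

raw = spark.read.json("/mnt/raw/orders/")     # hypothetical source path
cleansed = (
    raw.dropDuplicates(["order_id"])          # basic data quality step
       .filter(F.col("amount") > 0)
)
cleansed.write.format("delta").mode("overwrite").save("/mnt/curated/orders")
```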
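And for the Airflow item, a minimal DAG sketch that chains an extract task before a load task on a daily schedule. The dag_id, task names, and task bodies are placeholders standing in for real pipeline steps.

```python
# Illustrative sketch only: a two-task Airflow DAG with a daily schedule.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull raw files from source systems")  # placeholder task body

def load():
    print("load curated data into BigQuery")     # placeholder task body

with DAG(
    dag_id="daily_sales_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task  # run extract before load
```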
Good to have:
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration skills.
- Ability to work independently and as part of a global team.
- Passionate about data solutions.
- Self-motivated and able to work in a fast-paced environment.
- Detail-oriented and committed to delivering high-quality work.
Benefits:
- Kloud9 provides a robust compensation package and a forward-looking opportunity for growth in emerging fields.
Equal Opportunity Employer:
- Kloud9 is an equal opportunity employer and will not discriminate against any employee or applicant on the basis of age, color, disability, gender, national origin, race, religion, sexual orientation, veteran status, or any classification protected by federal, state, or local law.